Additions to wiki - Model Interpretation Documentation by TimCookCountyDS · Pull Request #75 · ccao-data/wiki

TimCookCountyDS · 2026-05-22T20:04:33Z

No description provided.

wrridgeway · 2026-05-26T14:18:01Z

Could we resolve merge conflicts and get at least a tiny description for this PR before review?

Damonamajor · 2026-05-26T18:58:01Z

My main thought is if we want to include key aspects of this in the checklist?

TimCookCountyDS · 2026-05-27T14:19:02Z

> My main thought is if we want to include key aspects of this in the checklist?

@Damonamajor Definitely aligned with that sentiment. My thought is that we could just link this document in the checklist, with references to specific sections. I can go ahead and do that, and then call this finished.

Damonamajor · 2026-06-01T14:29:07Z

+
+### A. Balance Tests
+
+*(See the "Statistical Tests" section of the model performance report.)* In a perfectly matched sample, no feature would predict whether a property is included in the sales sample. Any feature that predicts inclusion in the sales sample at a statistically significant level (that is, better than chance) is likely over- or under-represented in the sample and may bias your results. This is especially concerning for features that also turn out to have high SHAP values in your results. To check this, we run a simple logistic regression predicting the likelihood of a sale given a property's features. For each feature in the report, the resulting p-value tells you whether the feature predicts inclusion in the sample at a level greater than expected by chance, while the Beta value gives you a relative sense of the weight (importance) and direction (inclusion vs. exclusion) of that feature. In our report, asterisks mark statistically significant predictors. Low p-values suggest statistical significance; large Beta magnitudes suggest a large impact. When a feature is predictive of inclusion in the sample, your sample is likely biased towards or away from properties with this feature, and the model may therefore value these or other properties inaccurately.


I also like the note of where we can find this. It may be a bit too nitty to do this for every section, but maybe under the big headers, note which sections of reports we can find the different interpretations.
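
For reference, the balance test described in the quoted section could be sketched roughly as follows. This is a minimal, standard-library-only illustration on synthetic data, not the code behind the model performance report: the feature names (`building_sqft_z`, `noise_z`), the simulated sale probabilities, and the helper functions are all assumptions made up for the example. It fits a logistic regression of sold/not-sold on the features by Newton-Raphson and reports each Beta with a two-sided Wald p-value.

```python
import math
import random


def solve(a, b):
    """Solve the small dense linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * v for x, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def grad_info(X, y, beta):
    """Log-likelihood gradient and observed information (negative Hessian)."""
    k = len(beta)
    grad = [0.0] * k
    info = [[0.0] * k for _ in range(k)]
    for xi, yi in zip(X, y):
        p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
        for a in range(k):
            grad[a] += (yi - p) * xi[a]
            for c in range(k):
                info[a][c] += p * (1.0 - p) * xi[a] * xi[c]
    return grad, info


def fit_logit(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson; return (betas, p_values)."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad, info = grad_info(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    # Wald test: standard errors are square roots of the diagonal of info^-1.
    _, info = grad_info(X, y, beta)
    p_values = []
    for j in range(k):
        unit = [1.0 if i == j else 0.0 for i in range(k)]
        se = math.sqrt(solve(info, unit)[j])
        p_values.append(math.erfc(abs(beta[j] / se) / math.sqrt(2)))
    return beta, p_values


# Synthetic parcels (hypothetical): larger homes are simulated to sell more
# often, while noise_z has no effect on whether a parcel sells.
random.seed(0)
names = ["intercept", "building_sqft_z", "noise_z"]
X, y = [], []
for _ in range(2000):
    sqft, noise = random.gauss(0, 1), random.gauss(0, 1)
    p_sale = 1.0 / (1.0 + math.exp(-(-1.0 + 0.8 * sqft)))
    X.append([1.0, sqft, noise])
    y.append(1 if random.random() < p_sale else 0)

betas, p_values = fit_logit(X, y)
for name, b, p in zip(names, betas, p_values):
    flag = "*" if p < 0.05 else ""  # asterisk marks p < 0.05, as in the report
    print(f"{name:16s} beta={b:+.3f}  p={p:.4f} {flag}")
```

In this synthetic setup, `building_sqft_z` is built to drive inclusion in the sales sample, so it should come out with a positive Beta and a small p-value, which is the pattern the balance test is meant to flag.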

ccao-jardine

This is solid, thanks to everyone who has written and pitched in to review! I'm here to add my own comments. Nitpicks are optional; anything not marked as a nitpick I'd like to discuss or resolve.

TimCookCountyDS · 2026-06-09T19:35:50Z

@ccao-jardine Feedback incorporated. Let me know if anything else is needed before I merge.

wrridgeway

Looking good! I'll do another review once we've added something like a terms section to make this a bit easier to read through.

wrridgeway · 2026-06-10T16:06:22Z

+**Overview:**
+
+1. Assessing how representative your sales sample is of the assessment set.
+   - a. Balance tests
+   - b. Visual inspection
+   - c. Not missing at random
+   - d. Domain specific approach
+
+2. Noting any real-world housing market changes that may impact your model, and/or interactions between data and model that may affect your results (model drift, data drift).
+
+3. Interpreting model performance (evaluating machine learning and assessment metrics).
+


Suggested change

**Overview:**

1. Assessing how representative your sales sample is of the assessment set.
   - a. Balance tests
   - b. Visual inspection
   - c. Not missing at random
   - d. Domain specific approach

2. Noting any real-world housing market changes that may impact your model, and/or interactions between data and model that may affect your results (model drift, data drift).

3. Interpreting model performance (evaluating machine learning and assessment metrics).

There is an "Outline" button next to markdown files that already provides this feature in a really clean way.

Or, if we're committed to this outline, I'd link to the sections through it using section links.

wrridgeway · 2026-06-10T16:12:14Z

I would replace this outline with a "terms" section and then try to clean up the constant switching between population, sample, sales, and assessment. It's a lot of parentheses and extra language that we could get out of the way super quick and then use a couple small, consistent terms throughout. I tried to clean this up in the word doc but perhaps I made it worse.

Useful terms

Sample: the universe of parcel sales we use to train and test our model

Population: the universe of parcels that the model needs to value

etc...

TimCookCountyDS · 2026-06-25

Great call. I think it is really important to be clear on these terms (population, sample, sales, and assessment), as they can be a source of confusion when discussing different model outputs and types of evaluation, especially with regard to differences between ML evaluation and domain-specific evaluation.

Tim Sparer and others added 16 commits May 21, 2026 22:49

Add first draft of s.o.p for model evaluation

9106cbc

Update Model-Evaluation-ML-metrics.md

1d656fc

copy-edit rmv training

Update Model-Evaluation-ML-metrics.md

ceea4df

copy edit 2

Update Model-Evaluation-ML-metrics.md

a632697

Update Model-Evaluation-ML-metrics.md

ea84cff

heading

Update Model-Evaluation-ML-metrics.md

41e659f

Update Model-Evaluation-ML-metrics.md

feb5527

copy edits

Update Model-Evaluation-ML-metrics.md

815be14

outline edit

Update Model-Evaluation-ML-metrics.md

9ca3b0b

update broke assessment metrics links

Update Model-Evaluation-ML-metrics.md

c2d46f3

Update Model-Evaluation-ML-metrics.md

037f4ec

Update Model-Evaluation-ML-metrics.md

21a6aa5

Update Model-Evaluation-ML-metrics.md

deeaec7

Update Model-Evaluation-ML-metrics.md

9b6de2c

Edit links

Update Model-Evaluation-ML-metrics.md

202745d

edit hyperlinks 2

Update Model-Evaluation-ML-metrics.md

748e942

close parens

TimCookCountyDS requested a review from Copilot May 22, 2026 20:04

Copilot started reviewing on behalf of TimCookCountyDS May 22, 2026 20:04

TimCookCountyDS requested review from Damonamajor, wagnerlmichael and wrridgeway and removed request for Copilot May 22, 2026 20:04

Update README.md

9ab70d8

Save link in readme

TimCookCountyDS linked an issue May 27, 2026 that may be closed by this pull request

Additions to Model template for model scoring and interpretation #76

ccao-jardine self-requested a review May 27, 2026 15:27

Damonamajor reviewed Jun 1, 2026

Updated model eval wiki with BIlly's edits

3a61cb2

TimCookCountyDS added 2 commits June 5, 2026 18:43

Update Model-Evaluation-ML-metrics.md

7ebc982

fixed link

Update Model-Evaluation-ML-metrics.md

83a08ab

fix bias variance links

ccao-jardine requested changes Jun 8, 2026

TimCookCountyDS and others added 12 commits June 8, 2026 14:43

Update SOPs/Model-Evaluation-ML-metrics.md

fa8a18e

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Update SOPs/Model-Evaluation-ML-metrics.md

f29ab19

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Update SOPs/Model-Evaluation-ML-metrics.md

91f0a8e

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Update SOPs/Model-Evaluation-ML-metrics.md

3c7a57b

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Update SOPs/Model-Evaluation-ML-metrics.md

a806167

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Update SOPs/Model-Evaluation-ML-metrics.md

e856513

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Update Model-Evaluation-ML-metrics.md

52fd82d

corrected inaccurate links for lgbm missingness handling

Update SOPs/Model-Evaluation-ML-metrics.md

e6fffa8

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Update Model-Evaluation-ML-metrics.md

e0f0765

fixed progress and poverty link

Apply suggestion from @ccao-jardine

68ff750

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Apply suggestion from @ccao-jardine

f5e5d1a

Co-authored-by: Nicole Jardine <138712135+ccao-jardine@users.noreply.github.com>

Merge branch 'master' into tim_wiki_model_interp

cea9e47

wrridgeway requested changes Jun 10, 2026

ccao-jardine mentioned this pull request Jun 17, 2026

Create model selection SOP #1

Update SOPs/Model-Evaluation-ML-metrics.md

267b1e4

Co-authored-by: William Ridgeway <10358980+wrridgeway@users.noreply.github.com>

